I. SUMMARY

In this report we summarize our accomplishments in the research program
presently supported by Grant AFOSR-82-0258 over the period from July 1, 1982
to September 30, 1987, with primary emphasis on the accomplishments from
July 1, 1986 to September 30, 1987. The basic scope of this program is the
analysis, estimation, and control of complex systems with particular emphasis
on (a) the development of asymptotic methods and theories for nearly singular
systems; (b) the investigation of theoretical questions related to singular
systems; and (c) the analysis of complex systems subject to or characterized
by sequences of discrete events. These three topics are described in the next
three sections of this report. A full list of publications supported by Grant
AFOSR-82-0258 is also included.

The  principal  investigator  for  this  effort  is  Professor  Alan  S.  Willsky, 
and  Professor  George  C.  Verghese  is  co-principal  investigator.  Professors 
Willsky  and  Verghese  were  assisted  by  several  graduate  research  assistants  as 
well  as  additional  thesis  students  not  requiring  stipend  or  tuition  support 
from  this  grant.  The  list  of  47  publications  includes  14  papers  that  have 
appeared  or  have  been  submitted  to  journals,  9  journal  papers  presently  in 
preparation,  14  papers  presented  at  conferences,  1  S.B.  thesis,  3  S.M.  theses, 
and 6 Doctoral theses. In addition, Profs. Willsky and Verghese have been
invited  to  give  a  number  of  lectures  on  the  results  of  these  efforts  including 
Prof.  Willsky’s  featured  invited  presentation  at  the  August  1986  SIAM 
Conference  on  Linear  Algebra  in  Signals,  Systems  and  Control. 


II.  ASYMPTOTIC  ANALYSIS  FOR  PERTURBED  SYSTEMS 

Our  previous  research  in  this  general  area  has  produced  a  number  of 
important  results  and  directions  for  further  research.  In  this  subsection  we 
review the basic ideas behind our work, which is documented in detail in* [1-4,
7, 9, 11-13, 16, 20-21, 25-26, 30-32, 34-37, 44, and 47].

The model that has been the focus of much of our attention is the
perturbed linear system

    \dot{x}(t) = A(\epsilon)\, x(t)                                    (2.1)

where A(ε) is analytic in ε at ε = 0. If, furthermore, A(ε) loses rank at
ε = 0, (2.1) represents a singularly perturbed system that may display
dynamics at several time scales. Such models arise in describing complex
interconnected systems with weak couplings, "stiff" systems with time
constants ranging over several orders of magnitude, and finite-state Markov
processes (FSMP's) with rare transitions. In this last case A(ε) is an
infinitesimally stochastic matrix (i.e., column sums are zero and
off-diagonal terms are nonnegative) and x(t) is the vector of state
probabilities.

Our earliest work [1, 2, 4] on analyzing (2.1) used results on
perturbations of linear operators [Kato 1982] to develop a general approach to
determining if (2.1) has well-behaved time scale structures and, if so, to
construct a multiple time scale approximation. In the case of FSMP's this
work made clear the connection with stochastically discontinuous FSMP's and
provided a general result on hierarchical aggregation of perturbed FSMP's.

* In this report we refer to publications supported by AFOSR by number, e.g.,
[8]. References to other work are included in a second list and are referred
to by author and year, e.g., [Kato 1982].

The basic idea behind the approach in [1, 2, 4] is an examination of the
perturbed eigenstructure of (2.1). Specifically, let P_0(ε) denote the
projection onto the subspace spanned by the eigenvectors and generalized
eigenvectors corresponding to eigenvalues of A(ε) that converge to 0 as ε ↓ 0.
Then let

    A_1(\epsilon) = P_0(\epsilon) A(\epsilon) / \epsilon
                  = P_0(\epsilon) A(\epsilon) P_0(\epsilon) / \epsilon   (2.2)

As discussed in [1, 2, 4], A_1(ε) is analytic at ε = 0 if and only if A(0) has
semisimple null-structure (SSNS). In this case the process can be iterated to
produce A_2(ε), A_3(ε), etc. If this procedure can be taken to completion, A(ε)
is said to have multiple semisimple null-structure (MSSNS), and if
A(0), A_1(0), A_2(0), ... are all semistable (i.e., all eigenvalues strictly in
the left-half plane except for possible semisimple zero eigenvalues), A(ε) is
said to satisfy the multiple semistability (MSST) condition. In this case the
dynamics in (2.1) can be uniformly approximated by A(0), εA_1(0), ... in the
sense that

    \lim_{\epsilon \downarrow 0} \sup_{t \ge 0}
      \| e^{A(\epsilon) t} - e^{A(0) t}\, e^{\epsilon A_1(0) t}\,
         e^{\epsilon^2 A_2(0) t} \cdots \| = 0                           (2.3)

Furthermore, if A(ε) is infinitesimally stochastic, it is possible after the
fact to represent each successive time scale in (2.3) in terms of an
aggregated version of the FSMP at the preceding time scale.
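
As a small illustration of these definitions, consider a diagonal system with
one fast and one slow mode. The matrix below is an assumption chosen here so
that every step can be checked by hand; it is not an example taken from
[1, 2, 4]. The symbols follow (2.1)-(2.3).

```latex
A(\epsilon) = \begin{bmatrix} -1 & 0 \\ 0 & -\epsilon \end{bmatrix}, \qquad
P_0(\epsilon) = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A_1(\epsilon) = \frac{P_0(\epsilon) A(\epsilon)}{\epsilon}
              = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}.
% A(0) = diag(-1, 0) and A_1(0) = diag(0, -1) are both semistable, and
e^{A(0) t}\, e^{\epsilon A_1(0) t}
  = \begin{bmatrix} e^{-t} & 0 \\ 0 & 1 \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & e^{-\epsilon t} \end{bmatrix}
  = e^{A(\epsilon) t}.
```

In this special case the two-time-scale product reproduces exp{A(ε)t} exactly,
so the supremum over t is zero for every ε > 0; in general only the limit as
ε ↓ 0 vanishes.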


While these results are quite general, the price that apparently is paid
for this generality is a significant increase in complexity and a
corresponding loss of simple interpretation when compared to other results
developed for restricted classes of systems. In particular, the method in [1,
2, 4] requires the computation of the entire ε-dependent projection P_0(ε),
even though the ultimate objective, as shown in (2.3), is to discard all but
the critical ε-dependencies (as embodied in the matrices
A(0), εA_1(0), ε^2 A_2(0), ...). Consequently a key thrust of our subsequent
research has been to provide a bridge between our general results and previous
simpler ones in order both to develop alternate, simpler procedures and to
pinpoint the precise causes of increased complexity in the general case.

In [7, 9, 11-13, 20-21, 26], we have exposed the importance of the
invariant factors of A(ε), viewed as a matrix over the ring of functions of ε
analytic at ε = 0. Specifically, consider the Smith decomposition of A(ε):

    A(\epsilon) = P(\epsilon) D(\epsilon) Q(\epsilon)                    (2.4)

where |P(0)|, |Q(0)| ≠ 0 and

    D(\epsilon) = \mathrm{diag}(\epsilon^{k_1} I_1, \ldots,
                                \epsilon^{k_n} I_n)                      (2.5)

where the ε^{k_i} are the invariant factors of A(ε). Then, as shown in [20], the
time-scale analysis of A(ε) is equivalent to that for D(ε)Ā where Ā is the
ε-independent matrix Q(0)P(0). In taking this step we have discarded a
significant number of ε-dependent terms and have put the system into an
explicit form that allows us to make direct contact with previous results. In
particular, the time scales of the system, if they exist, are precisely
determined by the invariant factors, and the MSSNS and MSST conditions can be
directly related to the properties of a sequence of successive Schur
complements of Ā. This approach also allows us to make a stronger and more
precise statement of the main results in [1] involving in particular the
notion of a strong time scale decomposition.

These results prompted additional research on the relationship between
the invariant factor structure and eigenstructure of A(ε). In particular, in
[9, 11, 21] we show that MSSNS is equivalent to the orders of the eigenvalues
equalling those of the invariant factors. Going one step farther, note that
the gcds of all minors of A(ε) of each order determine the invariant
factors of A(ε), while the sums of principal minors of each order specify the
characteristic polynomial of A(ε) and therefore the orders of the eigenvalues.
From this observation we find that MSSNS is equivalent to a particular
consistency condition among these integer orders together with a
"non-cancellation" condition that guarantees that the leading terms of
principal minors of particular orders are not canceled when they are summed.

These conditions also suggest a related line of investigation for which
we have some initial results [9, 12, 26], namely the use of amplitude scaling
to modify non-principal minors of A(ε) so that the MSSNS condition is
satisfied. Consider, for example, the following system matrix that does not
have MSSNS:

    A(\epsilon) = \begin{bmatrix} -\epsilon & 1 \\
                                   0 & -\epsilon \end{bmatrix}           (2.6)

Note that the reason that (2.3) cannot be satisfied is that the (1,2)-element
of exp(A(ε)t) is t e^{-εt}, which has a maximal value of order 1/ε. Consider,
however, a similarity transformation that scales the state variables

    z(t) = \mathrm{diag}(\epsilon, 1)\, x(t)                             (2.7)

The transformed system matrix in this case is

    \begin{bmatrix} -\epsilon & \epsilon \\ 0 & -\epsilon \end{bmatrix}  (2.8)

which does have MSSNS. The procedure we have developed identifies diagonal
scalings for a restricted class of system matrices by identifying those minors
of A(ε) that are the reason for the violation of the MSSNS condition. We
expect that there is a generalization of this procedure that is applicable to
a far larger class of systems. Indeed, we have seen how our procedure can be
adapted to recover the special cheap control and high-gain scaling results in
[Sannuti 1983], but a more general result remains for the future.
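
The effect of the scaling can be checked numerically. The following is a
minimal sketch, not the scaling procedure of [9, 12, 26]: it applies a diagonal
state scaling to the 2x2 matrix with rows (-ε, 1) and (0, -ε), compares the
peak of the (1,2)-element of the matrix exponential before and after, and
compares the orders of the invariant factors with the orders of the
eigenvalues. The function names are hypothetical, and the invariant-factor
helper assumes an upper-triangular 2x2 matrix whose entries are monomials in ε.

```python
import math

def scale_state(A, d):
    # Similarity transform T A T^{-1} with T = diag(d).
    n = len(A)
    return [[d[i] * A[i][j] / d[j] for j in range(n)] for i in range(n)]

def peak_coupling(A):
    # For A = [[-a, g], [0, -a]] with a > 0, exp(A t)[0][1] = g * t * exp(-a t),
    # which peaks at t = 1/a with value g / (a * e).
    a, g = -A[0][0], A[0][1]
    return g / (a * math.e)

def invariant_factor_orders(orders):
    # orders[i][j] is the eps-order of entry (i, j) of an upper-triangular
    # 2x2 matrix with monomial entries; None marks a zero entry.
    k1 = min(o for row in orders for o in row if o is not None)  # gcd of 1x1 minors
    det_order = orders[0][0] + orders[1][1]                      # the only 2x2 minor
    return k1, det_order - k1

for eps in (1e-1, 1e-2, 1e-3):
    A = [[-eps, 1.0], [0.0, -eps]]            # unscaled system matrix
    A_scaled = scale_state(A, [eps, 1.0])     # scaling z = diag(eps, 1) x
    print(eps, peak_coupling(A), peak_coupling(A_scaled))

# Both eigenvalues are -eps (orders 1 and 1) before and after scaling.
print(invariant_factor_orders([[1, 0], [None, 1]]))   # unscaled matrix
print(invariant_factor_orders([[1, 1], [None, 1]]))   # scaled matrix
```

The unscaled peak grows like 1/ε while the scaled peak stays at 1/e, and only
after scaling do the invariant-factor orders agree with the eigenvalue orders
(1, 1), which is the equivalence stated for MSSNS.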

In [16, 25, 30-32, 36-37, 44, 47] we describe a series of results that
have arisen out of a second aspect of our efforts to simplify and interpret
the results in [1, 2, 4], in this case for FSMP's. As discussed in [16], this
line of research was motivated by a desire to understand the relationship of
the method in [1, 2, 4] to simpler results such as [Courtois 1977].
Specifically, for an FSMP, the eigenprojection P_0(ε) evaluated at ε = 0 is the
ergodic projection of the FSMP corresponding to the matrix A(0). Instead of
A_1(ε) in (2.2) consider

    F_1(\epsilon) = P_0(0) A(\epsilon) P_0(0) / \epsilon                 (2.9)

Note that since P_0(0) is an ergodic projection it can be written as

    P_0(0) = U V                                                         (2.10)

where each column of U is the vector of ergodic probabilities for a single
ergodic class of A(0). The matrix V is a membership matrix, with each row
specifying which states are in a particular ergodic class. From this one can
deduce that VU = I and that

    \exp\{F_1(\epsilon) t\} = U \exp\{G_1(\epsilon) t\} V                (2.11)

where

    G_1(\epsilon) = V A(\epsilon) U / \epsilon                           (2.12)

corresponds to an aggregated FSMP with one state corresponding to each ergodic
class of the original unperturbed FSMP (characterized by A(0)). The rates
between these aggregates, as specified by (2.12), represent average transition
rates from states in one ergodic class to states in another, with the
averaging done using the ergodic probabilities in U.
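
The averaging in (2.12) can be sketched in a few lines. The example below is
an illustration under stated assumptions, not code from the cited work: the
generator is taken to have the form A(ε) = A0 + εB, every state is assumed to
belong to exactly one ergodic class of A0 (no transient states), and the
4-state matrices, the function names, and the power-iteration routine used to
obtain the ergodic probabilities are all hypothetical choices.

```python
def stationary(Q, iters=200):
    # Ergodic probabilities of one irreducible class whose generator Q has
    # zero column sums, by power iteration on the uniformized chain I + Q/rate.
    n = len(Q)
    if n == 1:
        return [1.0]
    rate = 2.0 * max(-Q[i][i] for i in range(n))
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [p[i] + sum(Q[i][j] * p[j] for j in range(n)) / rate for i in range(n)]
    return p

def aggregate(A0, B, classes):
    # Build U and V with P_0(0) = U V, then G_1 = V A(eps) U / eps.
    # With no transient states A0 U = 0, so G_1 reduces to V B U.
    n, m = len(A0), len(classes)
    U = [[0.0] * m for _ in range(n)]
    V = [[0.0] * n for _ in range(m)]
    for c, members in enumerate(classes):
        block = [[A0[i][j] for j in members] for i in members]
        for i, prob in zip(members, stationary(block)):
            U[i][c] = prob      # ergodic probability of state i within class c
            V[c][i] = 1.0       # state i is a member of class c
    BU = [[sum(B[i][k] * U[k][c] for k in range(n)) for c in range(m)] for i in range(n)]
    G = [[sum(V[r][i] * BU[i][c] for i in range(n)) for c in range(m)] for r in range(m)]
    return U, V, G

# Hypothetical example: classes {0, 1} and {2, 3}; entry (i, j) is the rate j -> i.
A0 = [[-1.0, 2.0, 0.0, 0.0],
      [1.0, -2.0, 0.0, 0.0],
      [0.0, 0.0, -3.0, 1.0],
      [0.0, 0.0, 3.0, -1.0]]
B = [[0.0, 0.0, 0.0, 2.0],     # rare transition 3 -> 0 at rate 2*eps
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],     # rare transition 1 -> 2 at rate eps
     [0.0, 0.0, 0.0, -2.0]]
U, V, G = aggregate(A0, B, [[0, 1], [2, 3]])
print(G)
```

In this example state 1 carries ergodic probability 1/3 within its class and
leaves at rate ε, so the aggregate rate out of the first class is ε/3; state 3
carries probability 3/4 and leaves at rate 2ε, giving 3ε/2 in the other
direction. The resulting G is itself a 2-state generator with zero column sums.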

As pointed out in [16], the procedure just described breaks down if the
original FSMP has implicit time scale behavior resulting from the existence of
critical sequences of rare transitions from one ergodic class of A(0) to
another. Such sequences, which arise naturally in problems such as
reliability analysis of complex, fault-tolerant systems and queueing analysis
of data communication networks, correspond to the existence of transient
states in A(0), and transitions through such states are completely missed by
the averaging in (2.12). By keeping all ε-dependencies, as in (2.2), we avoid
this problem but with a considerable increase in complexity. In contrast, in
[16] we describe a method for computing only those ε-dependent terms that are
critical in describing longer-term behavior. Specifically, this procedure
involves replacing the "membership matrix" V in (2.11) by an ε-dependent
membership matrix V(ε). The ε-dependencies in V(ε) account for the fact that a
transient state of A(0) may in fact provide a bridge between ergodic classes
of A(0) at slower time scales, and thus the "membership" of this transient
state must be split in an ε-dependent way among the classes it couples.
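
A three-state chain makes the breakdown concrete. The sketch below is an
illustration constructed here, not the algorithm of [16]: state 1 is transient
at ε = 0 and returns to state 0 at rate 1, but it can also leak to the
absorbing state 2 at rate ε, so the only path from 0 to 2 is a sequence of two
rare transitions. The ε-dependent membership used below, built from the exit
probabilities of state 1, is an assumption made for the example, as are all
function names.

```python
import math

def generator(eps):
    # Columns sum to zero; entry (i, j) is the rate from state j to state i.
    # 0 -> 1 at rate eps, 1 -> 0 at rate 1, 1 -> 2 at rate eps, 2 absorbing.
    return [[-eps, 1.0, 0.0],
            [eps, -(1.0 + eps), 0.0],
            [0.0, eps, 0.0]]

def aggregated_rate(A, V, U, eps):
    # Entry (1, 0) of V A U / eps: aggregate rate from class {0} to class {2}.
    AU0 = [sum(A[i][k] * U[k][0] for k in range(3)) for i in range(3)]
    return sum(V[1][i] * AU0[i] for i in range(3)) / eps

def slow_eigenvalue(A):
    # A 3x3 generator has det(A) = 0, so its characteristic polynomial is
    # s * (s^2 - c1*s + c2) with c1 the trace and c2 the sum of the principal
    # 2x2 minors. Return the root of the quadratic of smaller magnitude.
    c1 = A[0][0] + A[1][1] + A[2][2]
    c2 = sum(A[i][i] * A[j][j] - A[i][j] * A[j][i]
             for i in range(3) for j in range(i + 1, 3))
    disc = math.sqrt(c1 * c1 - 4.0 * c2)
    return 2.0 * c2 / (c1 - disc)   # cancellation-free form of (c1 + disc) / 2

eps = 1e-3
A = generator(eps)
U = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]        # ergodic classes of A(0): {0}, {2}
V_fixed = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # state 1 assigned wholly to class {0}
p = eps / (1.0 + eps)                           # chance that state 1 exits to state 2
V_eps = [[1.0, 1.0 - p, 0.0], [0.0, p, 1.0]]    # assumed eps-dependent membership

rate_fixed = aggregated_rate(A, V_fixed, U, eps)      # coupling seen by fixed V
rate_split = eps * aggregated_rate(A, V_eps, U, eps)  # rate with split membership
print(rate_fixed, rate_split, slow_eigenvalue(A))
```

With the ε-independent membership the aggregate coupling is exactly zero, so
the transition from class {0} to class {2} is missed. Splitting the membership
of state 1 gives a rate of order ε^2 that agrees to leading order with the
magnitude of the exact slow eigenvalue of A(ε).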


There are several important features of this result. First, the
computations at each successive time scale are performed on increasingly
aggregated processes as in [Courtois 1977] but unlike [1]. Second, the
result has a strong graph-theoretic flavor in which one can work solely with
the integer orders of the transition rates of A(ε) to determine what the
aggregated classes will be and what the structures of V(ε) will be. That is,
using only simple integer arithmetic we can determine which elements of U and
V(ε) are nonzero and what the orders are of the nonzero elements of V(ε),
thereby making the structural computations extremely robust. Finally, a key
technical fact used extensively in this development is another
"no-cancellation condition", namely the fact that all transition rates between
states are nonnegative. The flows of probability mass along two different
paths from one state to another therefore add, so that leading-order terms are
never canceled.

We  feel  that  the  results  in  [16]  represent  an  important  breakthrough,  and 
in  fact  they  have  already  led  to  a  number  of  additional  results.  In 
particular,  we  have  developed  [25,  30]  a  corresponding  aggregation  procedure 
for  discrete-time  FSMP's.  The  interesting  aspect  of  this  result  is  that  all 
time scales other than the fastest are described by continuous-time FSMP's.
Also, we have developed [25, 31] aggregation results for a large class of
continuous-time finite-state semi-Markov processes that go well beyond any
other results in the literature. In particular, in our work we have allowed
both the transition probabilities and the holding time distributions to be
perturbed. By restricting attention to distributions with rational Laplace
transforms we are able to use the so-called method of stages to apply our FSMP
result and thereby prove the validity of a hierarchical approximation.
Important aspects of this work are (1) the continued use of a no-cancellation
condition although the "flow" rates arising from the method of stages are not
guaranteed to be positive or even real; and (2) the fact that the form of the
holding time distribution may lead to a non-transient state at one time scale
being split between two aggregates at the next scale, a form of behavior
that cannot occur in an FSMP.

There  are  three  other  extensions  of  this  work  on  which  some  results  have 
been  obtained.  In  [25,  32,  44]  we  present  some  initial  results  on  applying 
our  FSMP  results  to  analyze  the  reliability  of  a  fault-tolerant  system  that 
incorporates  an  automatic  fault  detection  and  reconfiguration  system.  An 
important  question  for  such  systems  is  the  effect  fault  detection  performance 
characteristics  such  as  false  alarms,  missed  alarms,  and  detection  delays  have 
on  overall  reliability.  Variations  in  such  parameters  can  be  viewed  as 
changes  in  the  orders  of  particular  transition  rates  in  the  FSMP  describing 
the  overall  system.  In  [25,  32,  44]  we  examine  a  relatively  simple  problem  of 
this  type.  Using  the  fact  that  our  results  allow  us  to  identify  time  scale 
structure  by  examination  of  integer  orders  of  transition  rates,  we  identify 
particular  orders  for  certain  of  these  rates  that  lead  to  overall  reliability 
(as  measured  by  the  order  of  the  transition  rate  from  an  aggregate  state 
representing  "working"  to  one  representing  "not  working")  of  maximal  order. 

The second extension that we have considered applies the method in
[16] to broader classes of systems of the form of (2.1). In particular, the
no-cancellation condition and its flow interpretation suggest possible
generalizations. The one we have begun to pursue [25, 36] is to the class of
positive systems, i.e., systems for which x(t) is guaranteed to stay in the
positive orthant if it begins there. Positive systems can also be represented
in a graphical manner, and while the structure of these systems can be far
richer than that for FSMP's, we have been able to obtain some results already.
In particular, an extension to compartmental models has been obtained. Also,
note that not all positive systems will satisfy the MSSNS condition (for
example, (2.6) describes a positive system). However, it is possible to
determine if a positive system has MSSNS by simple graphical means.

The  third  extension  we  have  addressed  [47]  has  been  motivated  by  the 
analysis  of  flexible  manufacturing  or  inspection  and  testing  systems.  In  part 
the  work  in  [47]  is  a  direct  application  of  the  methods  of  [16,  25,  32]  to 
models  describing  such  applications  and  in  particular  to  the  identification  of 
the  relationships  among  certain  rates  that  lead  to  particular  aggregated 
structures.  These  applications  did,  however,  lead  to  one  new  theoretical 
result  motivated  by  the  fact  that  in  many  applications  key  variables  often 
take  the  form  of  counts  of  particular  sets  of  transitions  (such  as  those 
modeling  completion  of  a  part  or  of  an  inspection).  An  important  observation 
is  that  at  fast  enough  time  scales,  these  transitions  occur  as  discrete 
events.  However,  at  slower  scales,  the  states  involved  in  these  transitions 
may  be  aggregated,  and  thus  the  count  of  "internal  transitions"  among  states 
contained  in  an  aggregate  state  must  be  modeled  as  a  random  variable.  An 
asymptotically  accurate  method  for  doing  this,  building  in  part  on  results 
from renewal theory, is developed in [47]. We believe these results will be
of significant value in the investigation of control problems for such
processes.




The final portion of research in this area has dealt with the analysis of
control problems associated with the model

    \dot{x}(t) = A(\epsilon) x(t) + B(\epsilon) u(t)                     (2.13)

In [9, 20] we present some results on time scale modification, i.e., on
modifying the invariant factors of A(ε) by application of feedback of the form

    u(t) = K(\epsilon) x(t)                                              (2.14)

In our more recent work in this area [34, 35] we have focused on a detailed
examination of the controllability structure of (2.13) and its discrete-time
counterpart. The key to this analysis is the Smith decomposition of the
controllability matrix

    \mathcal{C}(\epsilon) = [\, B(\epsilon) \mid A(\epsilon) B(\epsilon)
                              \mid \cdots \mid A^{n-1}(\epsilon) B(\epsilon) \,]   (2.15)

The invariant factors of this matrix determine the "orders of controllability"
of the system, and the Smith decomposition itself allows us to identify a
standard form for such systems: order-ε^0-controllable states are those that
are in the range of 𝒞(0); order-ε^1-controllable states are those that are
either driven directly by u(t) through an order-ε gain in B(ε) or have an
order-ε coupling with the order-ε^0-controllable states; i.e.,

    \dot{x}_1(t) = A_{11}(\epsilon) x_1(t) + B_1(\epsilon) u(t)
                   + \sum_{i \ge 2} A_{1i}(\epsilon) x_i(t)

    \dot{x}_2(t) = A_{22}(\epsilon) x_2(t) + \epsilon A_{21}(\epsilon) x_1(t)
                   + \epsilon B_2(\epsilon) u(t)
                   + \sum_{i \ge 3} A_{2i}(\epsilon) x_i(t)

    \dot{x}_3(t) = A_{33}(\epsilon) x_3(t) + \epsilon A_{32}(\epsilon) x_2(t)
                   + \epsilon^2 A_{31}(\epsilon) x_1(t)
                   + \sum_{j \ge 4} A_{3j}(\epsilon) x_j(t)

                                                                         (2.16)

These  results  allow  us  to  develop  asymptotic  methods  for  pole  placement 
via  high-gain  feedback.  In  addition,  we  have  some  initial  results  relating 


the invariant structure of (2.1) with Willems' notions of almost-invariance for
unperturbed  systems  [Willems  1981,  1982]. 


III.  SINGULAR  SYSTEMS 


Our recent research in this area, as documented in [17, 19, 22, 24,
27-29, 40-43], has focused, for the most part, on the class of two-point
boundary-value descriptor systems (TPBVDS's):

    E x(k+1) = A x(k) + B u(k)                                           (3.1)

    v = V_i x(0) + V_f x(N)                                              (3.2)

Note that E and A may both be singular, so that (3.1) allows one to model a
large class of noncausal systems. For this reason, it is natural to analyze
this model together with the general boundary condition (3.2). Models of this
type and their extension to more than one independent variable frequently
arise in the description of spatial or space-time phenomena. Examples range
from discretized versions of equations describing electromagnetic fields or
gravitational anomalies, to models for distributed systems such as flexible
structures, to models that are used as the basis for solving problems in
computational vision such as motion estimation and shape from shading (see, in
particular, [Rougée 1987] in which the connection between this last class of
problems and boundary-value models is made explicit).

Motivated  by  the  wealth  of  potential  signal  and  image  processing 
applications,  we  began  our  investigation  in  this  area  with  the  study  of 
estimation  problems  for  (3.1),  (3.2)  and  also  for  a  particular  class  of  2-D 
models  (i.e.,  models  with  two  independent  variables)  [17,  19,  22].  In 
particular,  in  [19]  we  analyze  the  problem  of  estimating  x(k)  in  (3.1),  (3.2) 


given the interior observations

    y(k) = C x(k) + r(k), \quad k \in [1, K-1]                           (3.3)

and the boundary measurements

    y_b = W_i x(0) + W_f x(N) + r_b                                      (3.4)

Using the method of complementary models (see [19], [Adams et al., 1984], and
[Weinert and Desai, 1981]) we derived a generalized Hamiltonian form for the
optimal estimator:


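    \begin{bmatrix} E & -BQB' \\ 0 & -A' \end{bmatrix}
    \begin{bmatrix} \hat{x}(k+1) \\ \lambda(k+1) \end{bmatrix}
    = \begin{bmatrix} A & 0 \\ -C'R^{-1}C & -E' \end{bmatrix}
      \begin{bmatrix} \hat{x}(k) \\ \lambda(k) \end{bmatrix}
    + \begin{bmatrix} 0 \\ C'R^{-1} y(k) \end{bmatrix}                   (3.5)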


with  appropriate  boundary  conditions. 

Two points of importance in this specification are that (a) the optimal
estimator itself is a TPBVDS; and (b) in the standard causal system case
(E = V_i = I, V_f = 0) (3.5) reduces to the usual Hamiltonian form for the
optimal smoother. The first point raises the question of finding methods for
solving these implicit equations, while the latter suggests a possible
approach to their solution. In particular, as discussed in [Kailath and
Ljung, 1982] and [Adams et al., 1984], in the causal case it is possible to
block-diagonalize or triangularize the Hamiltonian dynamics, yielding a
variety of smoothing algorithms including those of Mayne and Fraser and of
Rauch, Tung, and Striebel. The specification of the transformations needed to
obtain such algorithms leads directly to Riccati equations, whose properties
can in turn be directly related to properties of the original system (e.g.,
reachability and observability) and of the estimator (e.g., its error
covariance and stability).


Motivated by this line of research for causal systems, we began an
analogous investigation for the estimator (3.5) for a TPBVDS. As described in
[19], the possible singularity of E and A makes this a more complex problem.
In particular, the approach that exactly parallels one used in the causal case
does not work for many TPBVDS's. On the other hand, we have discovered two
new generalized Riccati equations

    \Theta = A' (E \Theta^{-1} E' + B Q B')^{-1} A + C' R^{-1} C         (3.6)

    \Phi = A (E' \Phi^{-1} E + C' R^{-1} C)^{-1} A' + B Q B'             (3.7)

which, if solutions exist, provide the basis for block diagonalization of
(3.5). In [40, 41] we present some results on the existence and uniqueness of
solutions to such equations and their relation to properties of reachability,
observability, and stability. To do this, of course, it is necessary to
define and study system-theoretic concepts such as these for TPBVDS's, and it
was this necessity that led to the extensive set of results described in
[24, 27-29, 40, 42, 43], which we now discuss.
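
To see how the descriptor matrix E enters such an equation, it helps to look at
the scalar case. The sketch below assumes the second generalized Riccati
equation has the scalar form phi = a^2 / (e^2 / phi + crc) + bqb, where crc
stands for C'R^{-1}C and bqb for BQB' (both names are hypothetical), and solves
it by simple fixed-point iteration. This is only an illustration of the
equation's structure for positive bqb and crc; it is not the solution method
or the existence theory of [40, 41].

```python
def solve_scalar_riccati(e, a, bqb, crc, phi0=1.0, tol=1e-12, max_iter=10_000):
    # Fixed-point iteration phi <- a^2 / (e^2 / phi + crc) + bqb.
    # For bqb > 0 and crc > 0 the map is increasing and bounded on phi > 0,
    # so the iteration converges to the unique positive fixed point.
    phi = phi0
    for _ in range(max_iter):
        nxt = a * a / (e * e / phi + crc) + bqb
        if abs(nxt - phi) <= tol * max(1.0, abs(nxt)):
            return nxt
        phi = nxt
    raise RuntimeError("fixed-point iteration did not converge")

# Causal case e = 1: the usual scalar steady-state Riccati equation.
phi_causal = solve_scalar_riccati(e=1.0, a=0.5, bqb=1.0, crc=1.0)

# Singular case e = 0: the equation collapses to phi = a^2 / crc + bqb.
phi_singular = solve_scalar_riccati(e=0.0, a=2.0, bqb=1.0, crc=4.0)

print(phi_causal, phi_singular)
```

When e = 0 the dependence on the previous iterate disappears and the solution
is available in closed form, which gives a feel for why a singular E changes
the character of the problem rather than merely its numerical values.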

In [24, 27-29, 40, 42, 43] we describe the results of our research to
date in developing a system theory for TPBVDS's. Our line of investigation
has been strongly motivated by the work of Krener [1980, 1985a, 1985b] who has
investigated the class of standard (i.e., not descriptor) continuous-time
boundary-value systems

    \dot{x}(t) = A x(t) + B u(t)                                         (3.8)

    v = V_i x(0) + V_f x(T)                                              (3.9)

Part of our work has paralleled that of Krener, with notable differences
because of the potential singularity of both E and A. In addition, our
interest in the smoothing problem and in particular in its efficient solution
has led us to investigate other topics such as stability and recursive
solutions for TPBVDS's.


One of the basic results derived in [27] and used heavily throughout our
work is the following. Suppose that (zE-A) is a regular pencil (i.e., its
determinant is not identically zero). Then it is possible to transform (3.1),
(3.2) so that E and A commute. Well-posedness then is equivalent to the
invertibility of (V_i E^N + V_f A^N). In this case, we can always put the
system in normalized form, so that

    V_i E^N + V_f A^N = I                                                (3.10)

    \alpha E + \beta A = I                                               (3.11)

for some pair of real numbers α and β. As discussed in [27], equation (3.11)
greatly simplifies many results connected with TPBVDS's. For example, there
is a much simpler statement of a generalized Cayley-Hamilton theorem for E and
A in this case, and this in turn leads to simpler reachability and
observability results than were available previously.
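
The well-posedness condition is easy to see in the scalar case, where E and A
commute trivially. The sketch below is a minimal illustration and not one of
the recursive solution methods of [27]: it stacks the N dynamic equations
e x(k+1) = a x(k) + b u(k) and the boundary condition vi x(0) + vf x(N) = v
into one linear system and solves it exactly with rational arithmetic. The
function name and argument names are hypothetical. The determinant of the
stacked system equals vi e^N + vf a^N up to sign, so the check at the top is
the scalar form of the invertibility condition stated above.

```python
from fractions import Fraction

def solve_scalar_tpbvds(e, a, b, vi, vf, v, u):
    # Solve e*x(k+1) = a*x(k) + b*u(k), k = 0..N-1, with vi*x(0) + vf*x(N) = v.
    N = len(u)
    if vi * e**N + vf * a**N == 0:
        raise ValueError("not well-posed: vi*e^N + vf*a^N = 0")
    n = N + 1
    # Augmented matrix [M | rhs] over exact rationals.
    rows = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for k in range(N):
        rows[k][k] = Fraction(-a)
        rows[k][k + 1] = Fraction(e)
        rows[k][n] = Fraction(b) * u[k]
    rows[N][0] += Fraction(vi)
    rows[N][N] += Fraction(vf)
    rows[N][n] = Fraction(v)
    # Gauss-Jordan elimination; a nonzero pivot exists because M is invertible.
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    return [rows[k][n] for k in range(n)]

# Singular E (e = 0): x(k) is fixed pointwise by u(k); only x(N) uses the boundary.
x_singular = solve_scalar_tpbvds(e=0, a=1, b=1, vi=0, vf=1, v=5, u=[1, 2, 3])

# Causal special case (e = 1, vi = 1, vf = 0): an ordinary initial-value problem.
x_causal = solve_scalar_tpbvds(e=1, a=2, b=1, vi=1, vf=0, v=1, u=[0, 0, 0])

print(x_singular, x_causal)
```

With e = 0 a pure initial condition (vf = 0) is rejected as not well-posed,
since x(N) is then constrained by nothing; this is one simple way in which a
singular E forces the boundary condition to involve the final point.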

As in Krener's development, we have explored two notions of recursion for
TPBVDS's, namely inward from and outward toward the boundaries, and for each
there are associated concepts of reachability and observability. In
particular, in [27] we define an inward process z_i(k,ℓ), k < ℓ, which plays a
role similar to the state of a causal system in that it represents the
boundary condition (rather than initial condition) propagated inward to k and
ℓ from 0 and N using the intervening input values (i.e., u(j) for 0 ≤ j < k,
ℓ ≤ j < N). The outward process z_o(k,ℓ), on the other hand, summarizes all that
one needs to know about the input between k and ℓ in order to determine x
outside the interval. While these processes are similar in spirit to those of
Krener, the possible singularity of E and A leads to some differences and some
additional complexity. For example, in Krener's context z_o represents the
difference between the actual value of x at one end of an interval and the
value predicted for x at that point given x at the other end of the interval
and assuming zero input inside the interval. In our context, we cannot in
general predict in either direction, and therefore a modified definition must
be developed.

As indicated previously, there are two pairs of notions of reachability
and observability. Strong reachability refers to the ability to drive z_o(k,ℓ)
to any desired value, while weak reachability deals with z_i(k,ℓ). Strong
observability, on the other hand, refers to the ability to determine z_i(k,ℓ)
based on observations of u and y between k and ℓ, while weak observability
corresponds to our ability to determine z_o(k,ℓ) based on knowledge of u and y
outside the interval [k,ℓ]. In [27] we derive conditions for each of these
properties and in particular provide justification for the terminology
adopted. In addition, we also describe several methods for the efficient
solution of TPBVDS's. In particular, one method, which is similar in spirit
to two-filter solutions to smoothing problems, involves the simultaneous
outward-recursive computation of z_o and inward-recursive computation of z_i.
The solution x can then be computed from these quantities. A second solution
method, similar in form to the serial structure of the Rauch-Tung-Striebel
algorithm, consists first of the outward-recursive computation of z_o, followed
by the direct inward-recursive computation of x.

As  is  the  case  for  causal  systems,  many  of  the  results  for  TPBVDS’s  are 
simplified  and  can  be  carried  farther  for  the  class  of  stationary  TPBVDS’s, 


i.e.P  systems  as  in  (3.1),  (3.2)  but  for  which  the  effect  u(k)  has  on  x(L)  is 
a  function  only  of  L  -  k.  In  contrast  to  the  causal  case,  (3.1).  (3.2)  is  not 
stationary  for  arbitrary  choices  of  the  constant  matrices  A,  B,  V  ,  and  V^. . 

In  [24,  28]  we  define  the  class  of  stationary  TPBVDS’s  as  the  set  of  models  in 
(3.1),  (3.2)  for  which  and  V^.  each  commute  with  E  and  A  and  for  which 

ker(En)  C  ker^)  ,  ker(An)  C  ker  (Vf)  (3.12) 

(where  n  =  dim  x).  As  discussed  in  [24.  28]  the  technical  condition  (3.12) 
which  can  always  be  imposed  with  a  modification  in  system  behavior  only  near 
the  boundaries  (and  is  always  true  if  E  and  A  are  invertible)  provides  some 
additional  regularity  and.  in  particular,  allows  us  to  view  (3.1),  (3.2)  as 
the  restriction  of  a  TPBVDS  defined  on  a  larger  interval.  This  definition  of 
stationarity  also  allows  us  to  obtain  simplified  conditions  for  weak 
reachability  and  observability  (the  conditions  for  strong  reachability  and 
observability  are  simple  even  for  nonstationary  systems)  and  to  characterize 
minimal  realizations  in  a  fashion  that  is  exactly  analogous  to  Krener’s 
results.  In  particular,  a  stationary  TPBVDS  is  a  minimal  such  realization  of 
a  given  weighting  pattern  if  and  only  if  the  system  is  weakly  reachable  and 
observable  and  the  kernel  of  the  strong  observability  matrix  (i.e.,  the  set  of 
"strongly  unobservable  states")  is  contained  in  the  range  of  the  strong 
reachability  matrix  (i.e.,  the  set  of  "strongly  reachable  states").  Also,  as 
in [Krener 1985a], two minimal realizations need not be related only by a
similarity transformation, thanks to some flexibility that may exist in the
choice of V_i and V_f.

In addition to the result just cited, further investigations can be carried
out for stationary TPBVDS's. In particular, in [24, 29] we consider two
definitions of stability for stationary TPBVDS’s. Perhaps the more
interesting of the two requires the boundary value v to have an
asymptotically vanishing effect on the process x at points far from the
boundary as the boundaries recede toward ±∞. (Compare this with the notion
of stability for causal systems in which we require the effect of the initial
conditions to vanish asymptotically.) The conditions for a stationary TPBVDS
to be stable involve both the eigenstructure of {E,A} and the structure of V_i
and V_f. In particular, it is always possible to perform transformations on
(3.1) so that E and A are simultaneously block diagonal with three blocks, as
in (3.13) (this is a variation on the Kronecker form for a regular pencil
[Van Dooren, 1979]), where the blocks A_f and A_b both have all their
eigenvalues inside the unit circle and the block U has all its eigenvalues on
the unit circle. In this case the conditions of stationarity require that
V_i, V_f also be block diagonal, i.e.,

    V_i = diag(V_{i1}, V_{i2}, V_{i3}),  V_f = diag(V_{f1}, V_{f2}, V_{f3})        (3.14)

Stability, in the sense just defined, then is equivalent to the absence of the
third blocks in (3.13), (3.14) (i.e., |zE - A| must be nonzero on the unit
circle) together with the invertibility of V_{i1} and V_{f2}. Roughly
speaking, the first block of (3.13) corresponds to modes that have stable
propagation as k increases, and the invertibility of V_{i1} requires that all
of these modes be constrained at k = 0. Note that this does not mean that
these modes are causal, since V_{f1} need not be zero. A similar
interpretation can be given for the second block.

Another topic investigated in [24, 29] is stochastic stationarity.
Specifically,  consider  a  stationary  TPBVDS  (3.1),  (3.2)  where  v  is  a  zero-mean 
random  vector  with  covariance  Q  and  u(k)  is  a  zero-mean  white  noise  sequence, 
independent  of  v,  with  variance  I.  This  system  is  stochastically  stationary 
(i.e., the correlation matrix ℰ[x(k)x(ℓ)'] depends only on k - ℓ) if and
only if Q satisfies the generalized Lyapunov equation

    E Q E' - A Q A' = V_i B B' V_i' - V_f B B' V_f'        (3.15)

The  constant  covariance,  P,  of  x(k)  in  this  case  satisfies  a  second 
generalized  Lyapunov  equation 

    E P E' - A P A' = V_i E^N B B' (V_i E^N)' - V_f A^N B B' (V_f A^N)'        (3.16)

Also,  in  this  case  it  is  possible  to  derive  a  second-order  matrix  TPBVDS 
(analogous  to  one  derived  by  Krener  in  his  study)  whose  solution  yields  the 
correlation  matrix  for  x. 
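
As a numerical illustration, a generalized Lyapunov equation of the form
E P E' - A P A' = R (the form shared by (3.15) and (3.16), with R standing
for the right-hand side) can be solved for small systems by vectorization.
The following is only a sketch under stated assumptions: it presumes that
the solution is unique, i.e. that the matrix (E kron E) - (A kron A) is
nonsingular, and the function name solve_generalized_lyapunov and the
example matrices are hypothetical, not taken from [24, 29].

```python
import numpy as np

def solve_generalized_lyapunov(E, A, R):
    """Solve E P E' - A P A' = R for P by Kronecker vectorization.

    Assumption: (E kron E) - (A kron A) is nonsingular, so P is unique.
    """
    n = E.shape[0]
    # With row-major vectorization, vec(E P E') = (E kron E) vec(P),
    # and likewise for A.
    K = np.kron(E, E) - np.kron(A, A)
    p = np.linalg.solve(K, R.reshape(n * n))
    return p.reshape(n, n)

# Hypothetical example data (not from the report).
E = np.eye(2)
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

P = solve_generalized_lyapunov(E, A, R)
residual = E @ P @ E.T - A @ P @ A.T - R  # should be numerically zero
```

The Kronecker system has n^2 unknowns, so this approach is practical only
for small state dimensions.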

Finally,  as  in  the  causal  case,  there  are  results  relating  Lyapunov 
equations,  stochastic  stationarity,  and  stability,  although  at  present  the 
theory  is  not  complete.  In  the  causal  case  we  know  that  if  a  system  is 
reachable  from  the  noise,  then  stability  is  equivalent  to  the  existence  of  a 
positive  definite  solution  to  the  system’s  Lyapunov  equation,  and  this 
solution  represents  the  initial  state  covariance  that  leads  to  a  stationary 
state  process.  For  TPBVDS,  stability  is  equivalent  to  the  existence  of  a 
positive  definite  solution  to  (3.16)  if  the  system  is  strongly  reachable. 
However,  even  in  this  case,  there  may  or  may  not  exist  a  Q  satisfying  (3.15), 
so  the  relationship  of  stability  and  stochastic  stationarity  is  not  as  simple 
as  in  the  causal  case.  In  fact,  if  the  system  is  only  weakly  reachable,  it  is 
possible  for  x  to  be  stochastically  stationary  even  if  the  system  is  not 
stable.  A  complete  clarification  of  these  points  remains  for  the  future. 


IV.  SYSTEMS  SUBJECT  TO  DISCRETE  EVENTS 


The  general  theme  of  this  portion  of  our  research  has  been  the 
development  of  estimation  and  detection  algorithms  for  several  classes  of 
systems  subject  to  discrete  events.  The  area  of  discrete-event  dynamics  is 
one  that  is  presently  undergoing  a  dramatic  increase  in  attention  by  the 
research  community,  as  it  has  been  recognized  that  many  estimation  and  control 
problems  for  systems  and  processes  of  great  complexity  have  a  definite 
discrete flavor. Major questions that arise in this context include: (1)
what  kinds  of  models  and  theories  should  be  developed?  and  (2)  how  do  we 
develop  methods  capable  of  dealing  with  the  complexity  of  many  of  these 
problems?  This  latter  question  provided  the  principal  motivation  for  the 
research  described  in  Section  II  on  multiple  time  scale/aggregation  methods 
for  certain  classes  of  discrete-state  processes. 

Our  work  to  date  in  this  last  portion  of  our  research  has  been  motivated 
by  the  first  question.  Specifically,  the  concept  of  a  system  with  discrete 
events  is  so  broad,  it  is  possible  to  imagine  a  large  number  of  alternate 
mathematical  settings  that  might  be  candidates  for  exploration.  Consequently, 
one  must  look  carefully  at  the  potential  applications  in  order  to  choose 
meaningful formulations. This has been our approach, and in particular we have
focused to date on two problem areas and have recently initiated efforts in a
third. The first of these, failure detection in dynamic systems, is perhaps
the simplest discrete-event problem, as one is interested in detecting
individual, isolated events in ordinarily continuously-evolving systems. Our
second research direction, the development of suboptimal distributed
estimation algorithms for coupled finite-state processes with applications in
electrocardiogram (ECG) analysis, involves considerably more complex event
tracking problems and exposes a number of important issues in the monitoring
of complex discrete-event processes. Finally, the third area in which we have
begun research is the development of system-theoretic concepts for
discrete-event systems described by nondeterministic models of the type
introduced by Wonham (see, for example, [Vaz and Wonham 1986]).

Our  work  in  failure  detection  has  had  three  components  described  in  [38], 
[33],  [14],  and  [23].  One  of  the  major  problems  in  practical  failure 
detection  is  robustness.  Indeed  one  can  argue  that  this  problem  is  even  more 
challenging  in  the  failure  detection  context  than  in  the  control  context  since 
in  the  failure  detection  problem  one  is  typically  trying  to  generate  signals 
that  are  maximally  sensitive  to  some  effects  (failures)  and  minimally 
sensitive  to  others  (model  uncertainties).  This  issue  motivated  the  research 
described  in  [38]  on  the  generation  of  robust  redundancy  relations  for  failure 
detection. 

A redundancy relation or parity vector for a perfectly known linear system

    x(k+1) = A x(k) + B u(k)        (4.1)
    y(k) = C x(k) + D u(k)        (4.2)

is a vector v so that

    v^T \begin{bmatrix} C \\ CA \\ \vdots \\ CA^s \end{bmatrix} = 0        (4.3)

for some s ≥ 0. As described in [38] one can use v to construct a linear
combination of the (s + 1) most recent values of the output and input that
will be identically zero if (4.1), (4.2) are precisely correct. Such parity
checks can then be used to detect discrepancies between the actual data and
that predicted by the model.
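
A parity check of this kind can be sketched numerically as follows. This is
an illustrative construction under stated assumptions, not the procedure of
[38]: the parity vector v is taken as a left null vector of the stacked
matrix of C A^j, j = 0, ..., s, the check is formed as v'(Y - H U) over a
window of s + 1 samples, and the function name parity_setup, the example
matrices, and the simulated sensor bias are all hypothetical.

```python
import numpy as np

def parity_setup(A, B, C, D, s):
    """Return a parity vector v and the input-to-output window matrix H.

    Over a window of s+1 samples, Y = O x(k-s) + H U, with O = [C; CA; ...; CA^s].
    Assumption: O has more rows than its rank, so a left null vector exists.
    """
    ny, nu = C.shape[0], B.shape[1]
    powers = [np.linalg.matrix_power(A, j) for j in range(s + 1)]
    O = np.vstack([C @ Aj for Aj in powers])
    H = np.zeros(((s + 1) * ny, (s + 1) * nu))
    for i in range(s + 1):
        H[i * ny:(i + 1) * ny, i * nu:(i + 1) * nu] = D
        for j in range(i):
            H[i * ny:(i + 1) * ny, j * nu:(j + 1) * nu] = C @ powers[i - j - 1] @ B
    U, _, _ = np.linalg.svd(O)   # full SVD: last column of U is orthogonal to range(O)
    return U[:, -1], H

# Hypothetical second-order example with one input and one output.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
s = 2
v, H = parity_setup(A, B, C, D, s)

# Simulate s+1 samples of the failure-free model.
rng = np.random.default_rng(0)
x = rng.standard_normal(2)
us, ys = [], []
for _ in range(s + 1):
    u = rng.standard_normal(1)
    ys.append(C @ x + D @ u)
    us.append(u)
    x = A @ x + B @ u
Y, Uvec = np.concatenate(ys), np.concatenate(us)

check_ok = v @ (Y - H @ Uvec)              # zero up to rounding when the model is exact
Y_failed = Y.copy()
Y_failed[-1] += 1.0                        # hypothetical sensor bias on the newest sample
check_failed = v @ (Y_failed - H @ Uvec)   # no longer zero
```

The unknown state at the start of the window drops out because v
annihilates O, which is the content of the defining condition on v.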

Clearly  any  model  uncertainty  or  noise  will  contribute  to  this 
discrepancy,  reducing  the  value  of  a  parity  check  for  discriminating  between 
normal  and  failed  behavior.  In  [38]  we  investigate  the  problem  of  maximizing 
this  discrimination  capability  taking  uncertainty  and  noise  into  account.  For 
example, suppose that the parameters of the matrices A, B, C, D are uncertain and in
particular  can  take  on  one  of  N  sets  of  values  indexed  by  i.  Then  a  criterion 
capturing  the  desire  to  keep  the  left-hand  side  of  (4.3)  small  over  all 
possible  model  parameters  is  the  following: 


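    J = \sum_{i=1}^{N} \| v^T Z_i \|^2        (4.4)

where, with A_i and C_i denoting the ith set of parameter values,

    Z_i = \begin{bmatrix} C_i \\ C_i A_i \\ \vdots \\ C_i A_i^s \end{bmatrix}        (4.5)

As discussed in [38], minimizing (4.4), or its generalization to consider a
set of several orthogonal parity vectors, can be accomplished through a
singular value decomposition of

    Z = [ Z_1  Z_2  \cdots  Z_N ]        (4.6)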


Specifically the singular values of Z indicate the level of robustness of
corresponding parity checks. For example, the left singular vector
corresponding to the smallest singular value of Z is the most robust parity
check. Several important variations on this problem are also considered in
[38]. In particular, it is possible to define a statistical version of (4.4)
in order to incorporate both the effects of noise and the relative magnitude
of the state variables as measured by the state covariance. Also, it is
possible to formulate a similar problem in which we want the parity checks to
be large when a failure occurs:

    J = \sum_{i=1}^{N} \| v^T Z_i \|^2 - \sum_{i=N+1}^{2N} \| v^T Z_i \|^2        (4.7)

where the values of the index i from N + 1 through 2N correspond to the
uncertain system dynamics when a particular failure has occurred.
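
The singular value computation behind the most robust parity check can be
sketched as follows. The sketch assumes that each Z_i is the stacked matrix
of C_i A_i^j, j = 0, ..., s, for the ith candidate parameter set, that Z is
formed by placing the Z_i side by side, and that v is constrained to unit
norm; the function names and the three candidate models are hypothetical
and are not taken from [38].

```python
import numpy as np

def stacked_observability(A, C, s):
    """Z_i = [C; CA; ...; CA^s] for one candidate parameter set (assumed form)."""
    return np.vstack([C @ np.linalg.matrix_power(A, j) for j in range(s + 1)])

def most_robust_parity_vector(models, s):
    """Unit vector v minimizing sum_i ||v' Z_i||^2, via an SVD of Z = [Z_1 ... Z_N].

    Assumption: Z has at least as many columns as rows, so the last column
    of U pairs with the smallest singular value.
    """
    blocks = [stacked_observability(A, C, s) for A, C in models]
    Z = np.hstack(blocks)
    U, sv, _ = np.linalg.svd(Z)
    v = U[:, -1]
    J = sum(np.linalg.norm(v @ Zi) ** 2 for Zi in blocks)
    return v, J, sv, Z

# Hypothetical uncertainty: one pole of A takes one of three values.
C = np.array([[1.0, 0.0]])
models = [(np.array([[0.9, 0.1], [0.0, a]]), C) for a in (0.75, 0.80, 0.85)]
v, J, sv, Z = most_robust_parity_vector(models, s=2)
```

The attained cost J equals the square of the smallest singular value of Z,
so a small value indicates a parity check that stays close to zero over
all of the candidate models.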

While  the  results  in  [38]  are  of  significance,  they  fall  short  of  the 
complete  robust  failure  detection  theory  we  would  like  to  develop. 
Specifically,  the  discrete,  parametric  specification  of  model  uncertainty  (the 
discrete  aspect  of  which  can  be  relaxed)  is  restrictive.  In  particular  it 
would  be  desirable  to  have  a  robustness  theory  that  can  handle  model 
uncertainty  specified  in  terms  of  frequency  response  error  bounds.  Also, 
there  is  the  issue  of  designing  the  actual  failure  detection  residual 
generation  system.  In  particular,  the  method  in  [38]  determines  a  set  of 
parity  relations,  which,  as  discussed  in  [Chow  and  Willsky  1984],  can  then  be 
used  in  a  number  of  ways.  For  example,  one  method  of  residual  generation 
consists  of  simply  computing  the  finite  window  parity  checks  determined  by 
these  relations.  For  a  variety  of  reasons,  such  as  noise  rejection  and 
enhancement  of  failure  effects  for  detection  and  discrimination  from  other 
failures, it may be preferable to generate "closed-loop" residuals based on
the dynamic models specified by the parity relations, i.e., to design Kalman
filters or observers based on the dynamic relationships specified by the
parity relations.

These  observations  provided  the  motivation  for  the  investigation 
described  in  [33].  While  this  paper  does  not  deal  with  the  robustness  issue, 
it does establish a linear system-theoretic framework for the design of
failure  detection  systems  and  in  particular  makes  clear  the  connections  with 
the  geometric  and  frequency  domain  theories  of  linear  systems.  The  specific 
formulation used in [33] is the following:

    \dot{x}(t) = A x(t) + B u(t) + \sum_{i=1}^{k} L_i m_i(t)        (4.8)
    y(t) = C x(t)        (4.9)

where the matrix L_i models the way in which the ith failure mode affects the
dynamics and m_i(t) is an arbitrary waveform modeling the actual ith failure
time history (see [33] for examples of how various sensor and actuator
failures can be modeled in this way). The objective then is to design a
residual generation system

    \dot{w}(t) = F w(t) - E y(t) + G u(t)        (4.10)
    r_i(t) = M_i w(t) - H_i y(t) + K_i u(t),  i = 1, ..., p        (4.11)

so that r_i(t) = 0 for all i if there is no failure and also so that the jth
failure mode affects only a subset of the residual vectors, specifically
r_k(t), k ∈ Ω_j, where the coding sets Ω_j ⊂ {1,...,p} are distinct, i.e.,
Ω_i ≠ Ω_j for i ≠ j. If this can be accomplished, then failure detection and
identification reduces to a determination of the set of residuals that
deviate significantly from zero.




Note that for this approach to be effective, we would also like to make
sure that r_k(t), k ∈ Ω_j, actually do deviate from zero when the jth failure
occurs. This is equivalent to the invertibility of the systems defined from
m_j(t) as input to the set of signals r_k(t), k ∈ Ω_j, as output. As discussed
in [33], this is too restrictive a condition, and we settle for the less
restrictive condition of input observability, i.e., that the columns of the
transfer matrix from m_j(t) to the vector (r_k(t), k ∈ Ω_j) are linearly
independent over the field of real numbers (see [33] for a discussion). Also,
as developed in the paper, the basic problem on which all of the analysis
builds is the fundamental problem in residual generation (FPRG).

Specifically,  suppose  there  are  only  two  failure  modes  in  (4.8),  i.e.  k  =  2. 
The  FPRG  is  to  design  a  residual  generator 


    \dot{w}(t) = F w(t) - E y(t) + G u(t)        (4.12)
    r(t) = M w(t) - H y(t) + K u(t)        (4.13)

so that the map from (u, m_2) to r is zero and so that the map from m_1 to r is
input invertible. There are some strong similarities between this problem and
feedback design problems such as decoupling, so it is not surprising that duals
of familiar constructs in geometric system theory play an important role. In
particular, the concept of an unobservability subspace (see [33]) is crucial.
Specifically, the FPRG has a solution if and only if the intersection of the
range of L_1 and the smallest unobservability subspace containing the range of
L_2 is (0). Using this basic result, several other problems are solved in
[33], including the problem of designing residual generators capable of
distinguishing a set of failures both under the restrictive requirement that
simultaneous failures may occur and in the less restrictive situation in which
only a single failure can occur. In addition, frequency domain
interpretations of these results are given. It is important to note that
these results are not simple dualizations of existing results in geometric
system theory.

As  mentioned  previously,  the  other  aspect  of  our  work  in  discrete-event 
systems,  which  is  described  in  [8],  [10],  [15],  and  [18],  has  been  on  the 
development  of  distributed  estimation  algorithms  for  a  class  of  coupled 
finite-state  processes.  The  original  motivation  for  this  investigation  was  to 
develop  a  class  of  event-based  models  that  is  appropriate  for  describing 
cardiac behavior and for serving as the basis for ECG rhythm analysis. This
class  of  distributed  models,  however,  has  much  broader  potential  applicability 
to  other  distributed  monitoring  and  situation  assessment  problems. 

As  discussed  in  [15]  and  [18],  each  portion  of  the  heart  can  be  viewed  as 
cycling  through  a  set  of  discrete  states  corresponding  to  the  electrical 
events  that  result  in  muscle  contraction,  recovery,  and  rest.  The  timing  of 
these events is occasionally and dramatically affected by the occurrence of
particular events in other portions of the heart. This structure led us to
develop a model structure consisting of a set of N interacting subprocesses
each characterized by a state x_i taking values in a finite set. The overall
process, with state {x_1, ..., x_N}, is an FSMP possessing, however, a great
deal of structure. In particular, conditioned on the present values of all of
the subprocesses, the transition behavior of each subprocess is independent of
that of the others. Furthermore, while the transition probabilities of
subprocess i depend on the values of {x_j | j ≠ i}, there are far fewer values
for these probabilities than there are possible sets of states of the other
processes. That is, the transition probabilities of subprocess i depend on an
interaction variable h_i(x_j, j ≠ i) that takes on only a few values. This
allows  us  to  model  the  presence  of  strong  interactions  of  only  a  few  distinct 
types  among  the  subsystems,  which  is  a  fundamental  characteristic  of  many 
large-scale  systems. 
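
The model structure just described can be illustrated with a small
simulation. Everything specific in this sketch is a hypothetical stand-in
rather than the cardiac model of [15] and [18]: the three state names, the
two-valued interaction variable, and the transition probabilities are
invented for illustration. What the sketch does reflect is the structure
itself: given the current joint state, each subprocess moves independently,
and its transition probabilities depend on the other subprocesses only
through an interaction variable h_i with few values.

```python
import random

STATES = ("rest", "contract", "recover")

def interaction(other_states):
    """h_i: 1 if any other subprocess is contracting, else 0 (two values only)."""
    return 1 if "contract" in other_states else 0

# TRANSITIONS[h][state] -> list of (next_state, probability).
# Hypothetical numbers: contraction elsewhere makes a resting subprocess
# much more likely to contract.
TRANSITIONS = {
    0: {"rest": [("rest", 0.9), ("contract", 0.1)],
        "contract": [("recover", 1.0)],
        "recover": [("recover", 0.5), ("rest", 0.5)]},
    1: {"rest": [("rest", 0.3), ("contract", 0.7)],
        "contract": [("recover", 1.0)],
        "recover": [("recover", 0.5), ("rest", 0.5)]},
}

def step(joint, rng):
    """Advance every subprocess once, each conditioned only on the current joint state."""
    nxt = []
    for i, xi in enumerate(joint):
        h = interaction(joint[:i] + joint[i + 1:])
        next_states, probs = zip(*TRANSITIONS[h][xi])
        nxt.append(rng.choices(next_states, weights=probs)[0])
    return tuple(nxt)

rng = random.Random(0)
joint = ("rest", "rest", "rest")
path = [joint]
for _ in range(50):
    joint = step(joint, rng)
    path.append(joint)
```

With three subprocesses of three states each, the other subprocesses can be
in nine joint configurations, but each subprocess needs only two transition
tables, one per value of h_i.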

One  important  aspect  of  the  ECG  and  many  other  problems  is  that  the  event 
state is not observed directly. Rather one observes signals that can be
viewed as an encoding of particular key transitions in one or more of the
subprocesses (in the ECG case these transitions correspond to the initiation
of muscular contractions and recoveries resulting in the waveforms seen in the
ECG).  Consequently,  we  see  that  the  problem  is  fundamentally  one  of  finite 
state  estimation  or  decoding. 

A fundamental premise in [18] is that the optimal estimator cannot be
implemented (for computational reasons as in ECG analysis or for reasons of
geographic separation as in distributed battle management), and that one
wishes to design a distributed estimator consisting of interacting processors
each responsible for estimating the state of a single subprocess. There are
several critical questions that must be dealt with in designing an estimator
with this structure. In particular, the processor for a specific subprocess
must have some, hopefully reduced, model for the remainder of the system.
Also, there typically is a need for processors to exchange information, and
the questions that arise are what information should be exchanged and how
the quality of this information should be modeled. In [18] we describe one
suboptimal but systematic way in which to deal with each of these issues. In
particular, since the interaction variable h_i has a set of values of low
cardinality, we have developed a method for specifying an FSMP on this set
that approximates the behavior of h_i (which is not an FSMP). Also, it is
natural for the other processors to provide processor i with an estimate of
h_i obtained from the state estimates of the other subprocesses. Since all of
the processors are dynamic systems themselves, it is essential that processor
i uses a dynamic model for the information it receives from the other
processors. Again, one systematic approach to this is described in [18].

Also,  as  discussed  in  [18],  there  is  the  important  issue  of  performance 
analysis for such finite state estimators. It is argued that examination of
estimation errors at individual points in time is not appropriate in this
case, as an estimate with a small error in the timing of particular events,
while yielding large point-to-point errors, may actually be of high quality
when event sequences are compared. Again one simple approach to capturing such a dynamic
error  measure  is  described  in  [18]  (see  also  [10]),  but  much  more  needs  to  be 
done  in  this  area. 
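
The distinction between point-by-point errors and errors in the event
sequence can be made concrete with a toy comparison. This is not the error
measure of [18] or [10]; it is a minimal illustration in which the state
names, the two sequences, and the helper functions are all hypothetical.
The estimate below is the true sequence with each event delayed by one
sample.

```python
from itertools import groupby

def pointwise_errors(truth, estimate):
    """Number of time points at which the estimated state differs from the truth."""
    return sum(t != e for t, e in zip(truth, estimate))

def event_sequence(states):
    """Collapse runs of repeated states, keeping only the order of events."""
    return [state for state, _ in groupby(states)]

# Hypothetical data: same events, each detected one sample late.
truth = ["rest"] * 3 + ["contract"] * 2 + ["recover"] * 2 + ["rest"] * 3
estimate = ["rest"] * 4 + ["contract"] * 2 + ["recover"] * 2 + ["rest"] * 2

n_pointwise = pointwise_errors(truth, estimate)
same_events = event_sequence(truth) == event_sequence(estimate)
```

Here the estimate disagrees with the truth at several individual time
points even though it reports exactly the same sequence of events, which is
the situation that motivates an event-based error measure.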

Furthermore, the distributed nature of our ECG model, with its emphasis
on timing and control, has led us to investigate an alternate modeling
framework, stochastic-timed Petri nets, that offers some advantages in
terms  of  the  compactness  of  the  representation  and  the  fact  that  timing  and 
structural  aspects  of  the  model  can  be  described  separately.  The  results  of 
this  effort  will  be  described  in  [40]. 

Finally  there  has  been  a  flurry  of  research  on  discrete-event  dynamic 
systems  modeled  as  finite  state  machines  or  as  extended  state  machines  in 
which  time  and  temporal  logic  can  be  incorporated  (see,  for  example,  [Vaz  and 
Wonham 1986]). In [45, 46] we present our initial research efforts in this
area. In particular, we have developed a notion of stability for
nondeterministic automata and an associated notion of stabilizability when
control  is  included.  We  provide  a  procedure  for  determining  if  such  a  system 
is  stabilizable  and  for  constructing  stabilizing  controllers.  A  second  aspect 
of  our  work  is  motivated  by  the  clear  need  for  aggregate  models  for  such 
systems  if  realistic  applications  are  to  be  considered.  In  particular  we  have 
developed  the  notion  of  a  task,  consisting  of  a  set  of  state  transitions, 
described  controllers  to  implement  individual  tasks,  and  analyzed  the  joining 
of  these  primitive  controllers  to  implement  sets  of  tasks.  This  provides  the 
basis  for  considering  a  higher-level  description  of  a  discrete-event  system  in 
which  transitions  at  the  higher  level  correspond  to  the  completion  of  tasks  at 
the lower level.